ABSTRACT

As part of a larger study investigating intervention 
procedures for children classified as homogeneous on factorially 
derived dimensions of classroom behavior, students in grades 1-3 
(N=1,067) were screened using teacher ratings on the Walker Problem 
Behavior Identification Checklist (WPBIC) for the purpose of 
developing groupings of deviant classroom behavior using behavioral 
assessment procedures and factor analytic techniques. Each S's 
ratings on the WPBIC were scored on five factors and subjected to 
profile analysis. Homogeneous groupings were established on the five 
behavioral dimensions: acting-out, social withdrawal, 
distractability, disturbed peer relationships, and immaturity. 
Correlations indicated that, with the exception of acting-out and 
distractability, there was little overlap among item clusters 
comprising the five factors. Sex difference was significant within 
each of the three grade levels; neither grade level effect nor 
interaction between grade level and sex was significant. Results 
suggested that teacher checklist ratings of student behavior are a 
valuable and relatively inexpensive method of identifying homogeneous 
groupings of classroom behavior. (KW) 
This report describes a normative identification study which had
two primary objectives. These were: (1) to collect normative, behavioral
assessment data on different subgroupings of deviant/deficient
classroom behavior; and (2) to screen children in grades one, two, and
three as possible candidates for a school-based intervention program
designed to remediate classroom behavior problems.

Brief literature reviews are also provided in areas related to the
identification of behavior problem children in the classroom setting.
Areas reviewed include: (1) teacher ratings of classroom behavior, (2)
early identification of behavior problem children, (3) factor analysis
studies of classroom behavior dimensions, and (4) behavioral assessment
and grouping for differential treatment.

The Teacher as an Observer of Classroom Behavior 

The value and need for reliable identification of children with
behavior problems seem to be generally accepted by educators and
psychologists. However, since the publication of Wickman's 1928
monograph comparing the attitudes of teachers and clinicians toward
behavior problems of children, the teacher's role in the identification
process has been viewed with some equivocation. Wickman found
discrepancy rank-order correlations of -.22 and -.11 respectively
between the rankings of teachers and those of thirty mental health
specialists on the relative seriousness of various problem behaviors
of school children. Clinicians viewed social withdrawal and other
anti-social forms of behavior as more serious, in terms of pathology,
than did teachers. Teachers were more concerned with behaviors
disruptive of classroom order, discipline, and achievement (Wickman,
1928). Since, in this study, the judgments of psychologists were
accepted as a criterion against which teacher judgments were compared,
the lack of agreement between these two groups raised serious
questions about the competence of teachers in identifying disturbed
children. On the other hand, Wickman's research methodology has drawn
considerable criticism that has cast some doubt upon the credibility
of his findings. Watson (1933), for example, points out that teachers
and clinicians were given different instructions for the rating/ranking
task in the Wickman study. Teachers were instructed to rank behaviors
for present seriousness, while clinicians were asked to rank them
according to their importance or influence in handicapping a child's
future adjustment.

Stouffer (1952) reported a study in which he used essentially
the same research design as Wickman. His study demonstrated a much
closer agreement, positive rho of .61, between teachers and mental
hygienists in their ranking of the relative seriousness of children's
behavior problems. In addition, Stouffer reported a rank order
correlation of .87 between the ratings of his and Wickman's mental
hygienists. Stouffer concluded that teachers' attitudes toward
children's behavior problems had changed considerably since Wickman's
study and had become more like those of psychologists.
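
The rank-order correlations cited above (Wickman's -.22 and -.11,
Stouffer's .61 and .87) are coefficients computed on two rankings of
the same behaviors. As a minimal sketch of that kind of computation,
assuming Spearman's formula for untied ranks and using made-up rankings
(the behavior names and rank values below are hypothetical and are not
taken from either study):

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order correlation for two untied rankings of the same items."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("rankings must cover the same number of items")
    n = len(ranks_a)
    if n < 2:
        raise ValueError("need at least two items")
    # Sum of squared differences between paired ranks.
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))


# Hypothetical seriousness rankings (1 = most serious) of five behaviors.
behaviors = ["withdrawal", "defiance", "stealing", "inattention", "shyness"]
teacher_ranks = [4, 1, 2, 3, 5]
clinician_ranks = [1, 4, 3, 5, 2]

rho = spearman_rho(teacher_ranks, clinician_ranks)
print(f"rho = {rho:.2f}")
```

A negative coefficient, as in Wickman's data, means the two groups
tended to order the behaviors in opposite directions; a value near 1
means they ordered them almost identically.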

Studies by Hunter (1957) and Ullman (1952) were also reported in
the fifties, which showed greater congruence between teachers and
mental health experts in their evaluations of childhood behavior
problems than was true at the time of Wickman's study. Schrupp and
Gjerde (1953), in a replication of Wickman's research design, found
much more agreement between teachers and clinicians than was indicated
in studies reported during the late 1920's. The authors qualified their
findings by pointing out that disagreements were still evident, and
that the direction of the disagreements was similar to that found by
Wickman.

Different results were reported in studies by Clark (1951) and
Peck (1955). Peck's study revealed that teachers viewed undesirable
personality traits as the most seriously handicapping of behaviors;
regressive traits were slightly less serious; and aggressive behavior
was rated as least serious. Clark concluded from the results of his
study that teachers are more disturbed by children's behaviors which
annoy other children than by behaviors that affect teachers directly.

In the early sixties, Sarason (1960) and his associates maintained
that developing personality measures to identify children whose anxiety
levels are interfering with a productive use of their potential is
important because teachers do not perform this function to a satisfactory
degree. Sarason suggests teachers do not have either the time or the
training to act as psychological diagnosticians.

Bower (1960) also used clinicians' judgments of emotionally dis- 
turbed children as a criterion variable to evaluate teachers' judgments 
of the same sample on dimensions of emotional disturbance. Bower found 
a very close relationship between teachers’ and clinicians' judgments 
of emotional disturbance. In this study, teachers identified 87% of 
clinically identified children and identified a greater number of 
children as overly withdrawn or timid than as overly aggressive or 
defiant. Evidence from this study appears to refute the oft-cited 
criticism that teachers tend to ignore withdrawn children whose behavior 
may not be as disruptive or disturbing as that of an acting-out, 
aggressive child. 

Beilin (1959) has reviewed research from 1927 that relates to
the validity of teachers' identification of children with behavior
problems. His interpretation of research findings suggests that
teachers have become more like clinicians in making judgments about
children. Beilin feels teachers and clinicians will likely always
differ in basic attitude. Teachers, because they are task-oriented,
will probably focus more on problems disruptive of achievement than
will clinicians. Clinicians, on the other hand, are traditionally
more concerned with the child's adjustment. Thus, according to Beilin,
they are more likely to identify withdrawn children even though they
may be achieving satisfactorily (Kennedy, 1965).


Maes (1966) reported a study which demonstrates that emotionally
disturbed children in grades four, five, and six can be identified as
effectively through the use of a teacher rating scale and a group
intelligence test as through the use of these two sources of information,
in addition to arithmetic achievement, reading achievement, a modified
sociometric technique (a class play), and a self-concept inventory.

The predictive efficiency that Maes achieved with two variables (teacher 
ratings and intelligence estimates) equalled that which Bower demon- 
strated with the use of six variables. This procedure makes the iden- 
tification process considerably more efficient and lends further 
support to Bower's finding that teacher judgment is an important 
variable in the identification of children with behavior problems. 

Mathew Trippe (1961), in a discussion of the teacher's role in the
identification of children with behavior problems, argues that competent
teachers are the most qualified judges of disturbed behavior in the
school setting. He notes that requiring the judgments of teachers to
be validated against the judgments of clinicians fails to recognize
the role of teaching as different from the clinician's role of
diagnosing and administering treatments. Failure to distinguish between
these roles has resulted in some concern that teachers might
indiscriminately label children as deviant or disturbed. He suggests there is
no evidence to support this and that if a variety of school options
were available, teachers' attention to children with behavior problems
would result, not in the treatment of an illness, but in a better
placement for the child.

Thus, evidence exists that teachers are in much closer agreement
with mental health specialists in their judgments of classroom behavior
problems than was true thirty years ago. Although some questions are
still raised about the validity of teacher judgments of childhood
adjustment problems, as well as the wisdom of using clinicians' and
mental hygienists' judgments as a validation criterion, there appears
to be a general recognition that the classroom teacher's vantage point
is an especially good one for the initial stages of identifying such
children. In fact, the classroom teacher is in a unique position to
identify children with behavior problems, since she spends more time
in actual observation of the child than any other school personnel
(Kennedy, 1965).

A number of researchers have designed identification systems and
procedures that rely heavily upon the teacher's judgment of classroom
behavior problems (Becker, 1960; Cromwell, 1965; Dreger, 1964; Ross,
Lacey, and Parton, 1965; Bower, 1960; Quay and Quay, 1965; Zax, Cowen,
Izzo, and Trost, 1964; Spivack and Swift, 1966; Swift and Spivack,
1968; Phillips, 1968; and Walker, 1969). Many of the identification
instruments used in these studies consist of stimulus items that
describe behaviors which interfere or actively compete with successful
academic performance. According to Beilin (1959), teachers are most
concerned with classroom behavior which is disruptive of achievement.
Since the teacher is held responsible for the child's achievement
through the teaching-learning process, she should be an excellent
judge of classroom behavior that is incompatible with academic
performance.
The Case For Early Identification

The need for early identification of children with learning (Haring 
and Ridgway, 1967; and Fitzsimmons, Cheever, Leonard, and Macunovich, 
1969) and behavioral (Robins, 1966; Bower, 1960; Cowen, Zax, Izzo, and 
Trost, 1966; O'Neal and Robbins, 1953; Zax and Cowen, 1967; Zax, Cowen, 
Izzo, Madonia, Merenda, and Trost, 1966; and Cobb, 1970) problems has
received increasing attention in the last few years. Evidence from the 
above studies suggests that children with academic and behavioral dif- 
ficulties can be identified early in their school careers. Fitzsimmons, 
Cheever, Leonard, and Macunovich (1969) analyzed the academic histories
of 270 students from elementary through secondary school using pattern 
analysis and nonparametric techniques. Their analyses revealed that a 
majority of academically unsuccessful students (high school drop-outs — 
poorly performing graduates) could be identified as early as the third 
grade. The authors indicate that by the second grade, 50% of the 270 
students in the sample had experienced their first academic failure. 

By the fourth grade, 75% had experienced their first failure, and 90%
by the seventh grade. The most critical areas of initial difficulty
were in the basic skills subjects of language and mathematics.

Of greater significance was the finding that over 40% of the 
student records demonstrated a spread pattern (initially failing in 
only one or two academic areas). Distribution of the spread patterns 
through the school years showed a fairly consistent pattern beginning 
with an initial increase in the first three years (usually in the basic
skills area); then showing a more gradual rise over the years (spreading
to science and social studies areas); and finally reaching a high
point in the ninth and tenth grades. The spread effect indicates that 
many children give early warning signs of serious academic difficulties, 
and that they are likely to fall further and further behind in their 
academic skills the longer they remain in school. Although they 
present no data to support the hypothesis, the authors contend that 
the findings of their study suggest that intervention early in the 
child's academic career has more impact than later intervention. The 
authors reinforce their point by referring to the longitudinal re- 
search (not quoted or documented) which suggests that the further 
along in a student's career, the greater is the amount of "end-career" 
variance already accounted for. 

They recommend special remedial programs for the first four grades 
in which children experiencing academic difficulties would be assisted 
in improving the quality of their achievement. They argue further that 
particular attention should be paid to the basic skills area to pre- 
vent a spread of performance difficulty to other areas. 

In the area of behavioral disturbance, several studies have 
demonstrated correlational relationships between behavioral problems 
evidenced early in the child's school career and later maladjustment. 
Stennett (1965), for example, found that with the passage of time,
school children identified as emotionally handicapped performed
significantly less well than their peers. Westman, Rice, and Bermann
(1968) reported a correlation of .88 between maladjustment ratings of
children early in their school careers and their subsequent use of mental
health services (Zax, Cowen, Rappaport, Beach, and Laird, 1968).

Zax, Cowen, Rappaport, Beach, and Laird (1968) reported a study 
in which they identified two consecutive groups of first grade children 
having a high potential for being emotionally disturbed. Children 
manifesting high potential for being disturbed (labeled Red Tag) were
compared on school record and special test measures, with peers evi- 
dencing low potential for disturbance (labeled Non-Red Tag). The 
measures reflected achievement, classroom behavior, peer perceptions, 
attendance, and school nurse referrals. Forty-seven comparisons were 
made between the Red Tag and Non-Red Tag groups in the seventh grade. 

Ten of the 47 differences were statistically significant beyond the 
.05 level, which is a greater number than would be expected on a 
chance basis. All Red Tag children scored more negatively than the 
Non-Red Tag children on all significant differences. In addition, 
of the 37 non-significant differences, 30 found the Red Tag children 
scoring more negatively. 

Forty-two comparisons were made between the second group of Red 
Tag and Non-Red Tag children identified in the first grade. Thirteen 
of the 42 differences were statistically significant, and the Red Tag 
group scored more negatively than the Non-Red Tag group on all signif- 
icant comparisons. As with the first group, a majority of the non- 
significant comparisons (25 of 29) favored the Non-Red Tag group. The 
findings of this study have important implications for current special 
educational practices. The results suggest that children experiencing 
behavioral difficulties can be identified in the first grade. Further, 
the data indicate that behavior problems identified in the first year 
remain stable over time. Problems identified in grade one have a high
probability of being identified seven years later. The specific problem
behaviors may change over time. However, evidence for the
stability of behavior disturbance appears to be quite strong. The
authors argue that, "... if, as seems valid on the face of it, children
who manifest signs of poor adjustment are more likely than others to
grow up to be seriously disturbed, then considerable effort at early
identification of potential for maladjustment and the development of
programs to prevent this are justified."
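
The statement that ten significant differences out of 47 exceed what
chance would produce can be checked with simple arithmetic. The sketch
below is illustrative only. It assumes the comparisons are independent
(the measures in the study were very likely correlated, so the tail
probabilities are rough guides at best), and it treats the direction of
each non-significant difference as a fair coin under the null
hypothesis. The function and variable names are hypothetical.

```python
from math import comb


def binomial_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


ALPHA = 0.05  # significance level reported in the text

# (total comparisons, significant differences,
#  non-significant differences on which Red Tag children scored more negatively)
groups = {
    "first group": (47, 10, 30),
    "second group": (42, 13, 25),
}

for label, (total, significant, directional) in groups.items():
    nonsignificant = total - significant
    expected_by_chance = total * ALPHA
    p_significant = binomial_upper_tail(total, significant, ALPHA)
    # Sign-test style check on the direction of non-significant differences.
    p_direction = binomial_upper_tail(nonsignificant, directional, 0.5)
    print(
        f"{label}: expected by chance = {expected_by_chance:.2f}, "
        f"P(>= {significant} significant) = {p_significant:.2g}, "
        f"P(>= {directional} of {nonsignificant} in one direction) = {p_direction:.2g}"
    )
```

Under these assumptions only about 2.35 and 2.10 significant results
would be expected by chance in the two groups, well below the 10 and 13
reported.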

The work of Cobb (1970), in progress, on the identification and 
measurement of observable, achievement related behaviors in the first 
grade, is quite timely. His research design is sequential in that 
correlational relationships are established between predictor vari- 
ables (observable behaviors) and a criterion of measured academic 
achievement. The identified, achievement related behaviors will then 
be modified across children to determine if functional relationships 
exist between them and academic achievement. 

Thus, it appears a technology is developing that will allow the 
identification, prediction, and possible prevention of behavioral and 
academic difficulties in young children. Zax, Cowen, Rappaport, Beach, 
and Laird (1968) used an elaborate clinical procedure, similar to that
of Bower (1960), in identifying their groups of Red Tag children. They suggest
the identification process in general needs further study as a source 
of information for the development of optimal prevention procedures. 
Additionally they argue that the identification process must be made 
more efficient and streamlined. Their procedure, as well as the
observation system developed by Cobb (1970), is quite expensive in terms
of observer and teacher time. Both systems require training before they
can be used effectively. It would appear that several levels of early
screening in the school setting may prove functional as well as
economical. Walker (1969) has described such a model in an earlier paper.
It uses a 50 item behavior checklist as an initial screening device
(requiring approximately five minutes to complete per child).
High-scoring children are then selected for more intensive screening and
evaluation using direct observation and recording procedures.

Studies Using Factor Analysis and Clustering Techniques 

A number of recent studies have factor analyzed ratings of child
behavior by teachers, parents, and clinicians in an attempt to isolate
homogeneous behavioral groupings. The number of factors obtained in
these studies has varied from two (Peterson, 1965) to as many as
thirteen (Spivack and Swift, 1966). Peterson (1965), after reviewing a
number of studies using child behavior scales, argues that two major
factors account for the important variance in ratings of child behavior.
The content of the first factor described by Peterson relates to the
behavioral dimensions underlying the child's social adjustment.
The second factor describes behavioral dimensions associated with
extroversion-introversion. Becker and Krug (1964) suggest that the type
of factor analysis procedure used may determine the number of factors
actually obtained. In ratings of child behavior, one typically finds
two major centroid factors accounting for as much as half the variance,
accompanied by a series of smaller factors. If, however, analytic
rotational procedures are used (oblimax or varimax), Becker and Krug argue
that five to eight factors with reliable variance contribution are likely
to be obtained. Thus, there appears to be a general lack of agreement
among investigators about the number of dimensions that are necessary
and sufficient to account for behavioral differences among children
(Sines, Pauker, Sines, and Owen, In Press). There is little doubt,
however, that homogeneous groupings of ratings of child behavior can be
identified and isolated (Becker and Krug, 1964; Patterson, 1964; Kulik,
Stein, and Sarbin, 1968; Ross, Lacey, and Parton, 1965; Sines, Pauker,
Sines, and Owen, In Press; Phillips, 1968; Quay, 1964; and Walker, 1970).

Patterson (1964), for example, factor analyzed clinic ratings of
a sample of 100 boys between the ages of seven and twelve years. The
analysis procedure yielded five factors which the author labeled as
hyperactive, withdrawn, immature, aggressive, and anxious. Patterson
set up a profile analysis procedure based upon the factor structure.
The homogeneity of the obtained factor profiles was then analyzed.
The hyperactive, withdrawn, and aggressive profile groups were the most
homogeneous, with intraclass correlations of .55, .63, and .52
respectively. The immature and anxious groups were less homogeneous, with
coefficients of .42 and .39. All five factor profile groups were more
homogeneous than a sixth group of subjects, labeled random, with an
intraclass R of .11.

Ross, Lacey, and Parton (1965) developed the Pittsburgh Adjustment
Survey Scales to provide for an objective evaluation of the social 
behavior of elementary school age boys, using the observations of 
classroom teachers. An initial item pool of 94 items was obtained 
through use of an extreme group procedure. Behavior ratings were 
obtained on 209 boys in grades one through six. Each teacher in the
sample rated one randomly selected boy in her class. A principal- 
components factor analysis of the data yielded five factors which 
accounted for 40% of the total variance or 71% of the estimated non- 
error variance. Factor V, which contained only one item with a loading 
in excess of .50, was dropped from the analysis. The remaining four 
factors were labeled aggressive behavior, withdrawn behavior, pro- 
social behavior, and passive-aggressive behavior. Additional analyses 
indicated that the factor scales discriminated among independently 
selected groups of aggressive, withdrawn, and well-adjusted school 
children. For example, a group of 18 aggressive boys received mean 
scores of 94.4 on aggressive behavior; 11.1 on withdrawn behavior; 0.0
on pro-social behavior; and 77.8 on passive-aggressive behavior. A 
group of 18 well-adjusted boys received mean scores of 5.6, 5.6, 33.3, 
and 11.1 respectively on the same factor scales. 

Kulik, Stein, and Sarbin (1968) constructed a self-report checklist 
of antisocial activities for analyzing patterns of delinquent behavior. 
The study had three objectives: (1) to establish the dimensionality
of adolescent antisocial behavior, (2) to identify salient patterns of
antisocial behavior among consistently delinquent boys, and (3) to
demonstrate validity of dimensional and pattern analyses by relating
dimensions and patterns to other variables.

The 52 items of the checklist asked the subjects about a broad
range of misbehaviors. Cluster analysis of the items on three different
samples yielded four dimensions of antisocial behavior: delinquent role,
drug usage, parental defiance, and assaultiveness. The checklist was
filled out by 505 high school boys and 391 boys at institutions for
delinquents. The scores of delinquents and non-delinquents differed 
significantly on each of the four dimensions of antisocial behavior. 
Delinquent boys in the study were classified into seven empirical types 
based upon their score patterns on the four dimensions. The empirical 
types differed in racial composition and on other social and personal 
variables. 

Quay has conducted a number of factor analytic studies of ratings
and case histories of adolescents, children in special classes, and
delinquent boys (Quay, 1964; Quay, Morse, and Cutler, 1966). Quay
has identified four homogeneous factors or dimensions in these studies.
They are inadequate-immature, neurotic-conflicted, unsocialized
aggressive or psychopathic, and socialized or sub-cultural delinquency.
Quay points out that these behavior dimensions occur in delinquent,
emotionally disturbed, and "normal" populations. Differences among
these three groups on the four dimensions are quantitative rather than
qualitative. The magnitude of the scores varies from sample to sample,
but the dimensions remain the same (Quay, 1970).

Sines, Pauker, Sines, and Owen (In Press) developed the Missouri
Children's Behavior Checklist, which provides a set of descriptions of
children's behavior that may be rated by a child's parent. The purpose
of the study was to develop a method ... "for identifying groups
of children, each of which would be at the extreme of one of several
clinically or theoretically significant dimensions of children's
behavior." The final form of the checklist consisted of 70 statements
that were reduced from 95 descriptive behavioral statements. The
original behavior statements were selected from the existing literature
to sample six dimensions of behavior: aggression, inhibition,
hyperactivity, sleep disturbance, somatization, and sociability. Items were
assigned to behavior dimensions if a point biserial correlation between
the item and the total dimension score was .30 or greater, and if the
square of the point biserial r was at least twice as large as the square
of the r between that item and the total score on any of the remaining
five factors. This analysis was completed on parental ratings of 404
boys between the ages of five and sixteen years. The means and standard
deviations, on each of the six behavior dimensions, were compared for
24 boys seen in a university child psychiatry clinic with a group of
24 non-referred boys who were evaluated and classified as "normal"
children. There were statistically significant differences between the
two groups of boys on the checklist scales of aggression, inhibition,
hyperactivity, and sociability.
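
The item-assignment rule described above for the Missouri Children's
Behavior Checklist can be written out directly. The following is a
minimal sketch rather than the authors' own procedure: it assumes the
point biserial correlations between an item and each dimension total
have already been computed, and the function name, dictionary layout,
and example correlations are all hypothetical.

```python
def assign_item(correlations, min_r=0.30, ratio=2.0):
    """Return the dimension an item is assigned to, or None if none qualifies.

    correlations maps a dimension name to the point biserial r between
    the item and that dimension's total score.
    """
    for dimension, r in correlations.items():
        if r < min_r:
            continue
        others = [
            other_r for name, other_r in correlations.items() if name != dimension
        ]
        # r squared must be at least `ratio` times every competing r squared.
        if all(r ** 2 >= ratio * other_r ** 2 for other_r in others):
            return dimension
    return None


# Hypothetical correlations for two items across the six dimensions.
clear_item = {
    "aggression": 0.52, "inhibition": 0.10, "hyperactivity": 0.31,
    "sleep disturbance": 0.05, "somatization": 0.02, "sociability": -0.20,
}
ambiguous_item = {
    "aggression": 0.45, "inhibition": 0.05, "hyperactivity": 0.40,
    "sleep disturbance": 0.00, "somatization": 0.03, "sociability": -0.10,
}

print(assign_item(clear_item))
print(assign_item(ambiguous_item))
```

The second item correlates above .30 with two dimensions but dominates
neither, so under this reading of the rule it would be left unassigned.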

Walker (1970) factor analyzed behavior checklist ratings (by
teachers) of 534 children in grades four, five, and six. Boys and
girls were included in the sample. The procedure yielded five factors
that were subjected to a varimax orthogonal rotation to obtain a simple
structure. The five factors were: acting-out, withdrawal,
distractability, disturbed peer relationships, and immaturity. Analyses
revealed statistically significant differences in total checklist
score between males and females across all three grade levels.
Statistically significant differences in checklist score were found between
a group of emotionally disturbed children and a matched group of
non-disturbed children. A profile analysis procedure, based on the factor
structure, was established to record and analyze scores on each of the
factor scales.

The factor analysis techniques employed in the above studies are
useful in establishing the validity of an instrument, since they provide
information about the content of a scale (what it measures). These
procedures also provide for a more detailed description of behavior through
factorial, profile analysis techniques. The factorial dimensions
identified in the above studies share a high degree of similarity in
number as well as content. The strongest and most homogeneous factors
in these studies appear to be aggression, withdrawal, and hyperactivity
(Patterson, 1964; Sines, Pauker, Sines, and Owen, In Press). Behavior
dimensions associated with anxiety, immaturity, and disturbed peer
relationships appear to be less well defined and less homogeneous, but
still clearly identifiable. Several of these studies demonstrate that
different clinically identified or independently selected groups of
children received differential ratings on the factorial dimensions.

Thus, powerful evidence exists in the literature for the identification 
of homogeneous groupings of deviant behavior, as well as for the ex- 
ternal validity of such groupings. 

Consequently, it would appear that children receiving high scores 
on different behavior dimensions can be grouped for the purpose of 
providing differential treatments. However, the basis for grouping 
and for assignment to a treatment rests upon a rating by a teacher, a 
parent, or a clinician as to whether or not a given behavior is present 
in a child's repertoire. If the child receives a large number of 
deviant behaviors checked or rated on a factor scale, he is said to 
score high on, and be representative of, the behavior dimensions
measured by that factor. Although such ratings are quite useful for
identifying and locating specific populations of deviant children, they do
not predict the actual rates with which these behaviors occur and
therefore provide little information for the development of treatments and
remediation procedures. There have been no studies reported in the 
literature which demonstrate that actual rates of individual behaviors 
can be predicted or inferred from checklist ratings of whether the 
behavior is present or absent. In addition, no study has demonstrated 
a relationship between the number of behaviors indicated as present on 
a checklist and the rate of occurrence of such behaviors as measured by 
direct observation and recording procedures. 

It would appear that a homogeneous pool of subjects with respect 
to a given behavior dimension, such as hyperactivity or social with- 
drawal, would be highly variable in terms of the rate with which they 
produce the behaviors making up the behavior dimension. Some of the 
subjects would no doubt have very high rates; others moderate rates; 
and some low rates. Thus, in developing treatments for differential 
groupings of deviant behavior, it would seem necessary to also develop
homogeneous groupings with respect to the rate with which individual 
behaviors comprising the behavior dimension occur. For example, in 
developing a treatment for social withdrawal, an initial group of sub- 
jects could be identified on the basis of high scores received on a 
factor scale within a checklist which measures social withdrawal. The 
next level of screening, prior to assignment to treatment, would require 
the identification of a pool of subjects, from the initial group, who 
have low rates of social interaction. The second level of screening
provides direct information for the development of intervention
procedures. Similar screening procedures could be established for such
factors as aggression, hyperactivity, deviant peer relationships,
distractability, etc. using observation schedules. In summary, this model
simply requires a more empirical definition of factorial homogeneity,
and uses rate as a basis for assignment to treatment as well as for
evaluating the effectiveness of intervention.
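
The two-level screening model described above can be illustrated with a
short sketch. Everything concrete in it is an assumption made for
illustration: the record fields, the cutoff values, and the sample data
are hypothetical, and the text does not specify how either threshold
would be set.

```python
from dataclasses import dataclass


@dataclass
class Child:
    name: str
    withdrawal_score: int         # checklist factor-scale score (teacher rating)
    interactions_per_hour: float  # rate obtained by direct observation


def screen_for_withdrawal(children, score_cutoff, max_rate):
    """Two-level screen: checklist score first, observed rate second."""
    # Level 1: high scorers on the social-withdrawal factor scale.
    initial_group = [c for c in children if c.withdrawal_score >= score_cutoff]
    # Level 2: of those, keep children with low observed rates of interaction.
    return [c for c in initial_group if c.interactions_per_hour <= max_rate]


# Hypothetical data and cutoffs.
children = [
    Child("A", withdrawal_score=9, interactions_per_hour=2.0),
    Child("B", withdrawal_score=8, interactions_per_hour=14.0),
    Child("C", withdrawal_score=2, interactions_per_hour=1.5),
]
selected = screen_for_withdrawal(children, score_cutoff=7, max_rate=5.0)
print([c.name for c in selected])
```

In practice the observation data would be collected only for the
initial group; the sketch stores a rate for every child purely to keep
the example short.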

Assessment and Grouping for Differential Treatments 

In the field of behavior modification, intervention procedures 
have traditionally been designed to shape or modify the behavior(s) of
a single child. These single subject designs have focused upon precise 
analyses of the parameters of the target behavior(s) selected for 
modification. Intervention procedures have been adapted to the specific 
remediation requirements of the target behavior(s) as well as the rein- 
forcement preferences of the child. Dunn (1968), for example, has 
pointed out that the intervention program itself often becomes the 
diagnostic device. The success of this individualized approach to 
assessment and remediation has been impressive. However, specific 
intervention programs across children have thus varied as a result of: 
the target behaviors selected for remediation, situational variables 
associated with different treatment settings, and specific remediation 
requirements and reinforcement preferences of different target children. 
As a result, the large number of individual case studies and single 
subject designs reported in the literature have not resulted in 
clearly validated techniques or procedures that have a predictable 
effect across children or across behaviors. There is evidence in the 
literature that such techniques as social reinforcement, token 
reinforcement, and time-out procedures are effective in remediating 
behavioral deficits in specific instances. Nevertheless, there are no data 
to indicate under what treatment conditions, with what types of children, 
and across which behaviors these techniques are consistently effective 
in remediating behavior. 

Increasing attention is being given to the development of "group" 
intervention techniques that can be used simultaneously with a large 
number of children and that will have some generality of effect both 
across children and across behaviors (Packard, 1970; Walker and Buckley, 
1970; Walker, Mattson, and Buckley, in press). It would appear that the 
effective education of behaviorally handicapped (as well as other types 
of handicapped children) requires the development and validation of 
intervention procedures that are effective; that have some generality 
of effect — both across children and across behaviors; that have some 
generality of effect over time; and that are reasonably economical in 
terms of per child cost. 

Quay (1968) has provided a framework for delivery of remediation 
services to handicapped children that focuses upon assessment, grouping, 
and remediation. Quay's model is somewhat unique in that children with 
learning or behavioral handicaps are assessed on a variety of education- 
ally relevant measures and then grouped for remediation and instruction 
according to their performance on these measures. Homogeneous group- 
ings are established on dimensions of educationally relevant performance 
instead of upon hypothetical medical or psycho-social correlates of 
handicapping conditions. Thus, homogeneous groupings are established for 
instructional-remedial purposes, across children and across handicapping 
conditions. 

Purpose of the Study 

The present study reports the behavioral assessment procedures and 
results for a larger study, the purpose of which was the development and 
evaluation of intervention procedures for children classified as homo- 
geneous on factorially derived dimensions of classroom behavior. Specific 
objectives of the study are: (1) to develop homogeneous groupings of 
maladaptive or deviant classroom behavior using behavioral assessment 
procedures and factor analytic techniques; (2) to experiment with inter- 
vention strategies based upon the assessment data, that are specifically 
designed for remediation of behavioral deficits isolated by the grouping 
procedure; (3) to measure the efficiency and effectiveness of the inter- 
vention strategies in remediating behavioral deficits and producing 
behavior change. 



Method 

Assessment and Sample Selection Procedures 

The population of children in grades one, two, and three in the 
Eugene school system was screened using teacher ratings on the Walker 
Problem Behavior Identification Checklist (WPBIC) (Western Psychological 
Services, 1970). The school district required parental permission for 
completion of the ratings. A checklist was completed on each child for 
whom a signed permission slip was received. Of 5,500 children in grades 
one, two, and three, parental permission slips were received and teacher 
ratings were completed for 1,067 children. 

Children who received a checklist score of 21 or above (one standard 
deviation above the mean of the normative sample) were assigned to a pool 
for possible selection as experimental subjects. Each subject's ratings 
(by his teacher) on the WPBIC were scored on five factors within the 
checklist and subjected to a profile analysis procedure. Through this 
procedure, five pools of behaviorally homogeneous subjects were selected for 
further observation, screening and assessment. 

Observation schedules will be developed to provide more precise and 
more reliable measurement of the behavioral content of each factor. The 
observation schedules will be based upon the behavioral content of each 
factor in the checklist. These schedules will provide observation and 
recording of discrete units of behavior within the classroom setting. (An 
observation schedule for factor one, acting-out behavior, has been developed 
and is included as appendix one.) 

Each pool of behaviorally homogeneous subjects will be screened on 
the observation schedule developed for that factor. Subjects will then be 
drawn from this pool and assigned to an intervention procedure designed to 
remediate behaviors measured by that particular factor.* 

*The larger research study will last five years. One year will be devoted 
to developing intervention procedures for each of the five groups. 

The observation schedules will serve three functions in this study: 
(1) checking and corroboration of the teacher's ratings of classroom 
behavior on the WPBIC; (2) providing additional measures of factorially 
homogeneous behaviors through observation and recording of discrete 
behavioral units; (3) providing a basis for evaluating the efficiency and 
effectiveness of experimental intervention procedures. Five pools of 
homogeneous subjects were established on the following behavioral 
dimensions: (1) Acting-out (disruptive, aggressive, defiant); (2) Social 
Withdrawal (restricted functioning, avoidance behavior, low rates of peer 
interaction); (3) Distractability (short attention span, inadequate study 
skills, high rates of non-attending); (4) Disturbed Peer Relationships 
(inadequate social skills, high rates of coercive manding, high rates of 
dispensing punishing stimuli in social interactions); and (5) Immature 
(dependent, high rates of initiation to teacher, inadequate social and 
study skills). Homogeneity and grouping will be determined by profile 
analyses which indicate high scores on one factor and low or moderate 
scores on the four remaining factors. 
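
The selection rule just described (a total score at or above the cutoff, then a profile with one high factor and four low or moderate factors) might be expressed roughly as below. The cutoff of 21 comes from the text; everything else is an assumption made for illustration, including the function name `assign_pool`, the use of standardized factor scores, and the `high_cutoff` value that separates "high" from "low or moderate." The report does not state the numerical criteria used in the profile analysis.

```python
FACTORS = (
    "acting_out",
    "social_withdrawal",
    "distractability",
    "disturbed_peer_relations",
    "immaturity",
)
TOTAL_SCORE_CUTOFF = 21  # pool-entry cutoff stated in the text


def assign_pool(total_score, profile, high_cutoff=1.0):
    """Return the one factor on which a child is elevated, or None.

    `profile` maps each factor name to a standardized score. The value of
    `high_cutoff` is a hypothetical threshold, not one given in the report.
    """
    if total_score < TOTAL_SCORE_CUTOFF:
        return None  # not in the pool of possible experimental subjects
    elevated = [f for f in FACTORS if profile[f] >= high_cutoff]
    # Homogeneous only when exactly one factor is high.
    return elevated[0] if len(elevated) == 1 else None


# Hypothetical profile, for illustration only.
example = {
    "acting_out": 1.8,
    "social_withdrawal": 0.1,
    "distractability": 0.4,
    "disturbed_peer_relations": -0.2,
    "immaturity": 0.0,
}
print(assign_pool(25, example))
```

A child elevated on two or more factors is left unassigned under this rule, which is one way of keeping each pool behaviorally homogeneous.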

The Assessment Instrument 

The WPBIC consists of fifty stimulus items that describe observ- 
able classroom behaviors. The fifty checklist items were drawn from 
teacher descriptions of classroom behavior problems. A random sample 
of thirty experienced teachers was drawn from the population of fourth, 
fifth, and sixth grade teachers in a local (Oregon) school district. 

The teachers were then asked to nominate those children in their 
classes who exhibited chronic behavior problems. Each teacher was then 
interviewed and asked to describe the child's behavior problem(s) and 
to give operational descriptions of the behaviors that concerned them. 
Observable descriptions of overt behavior were abstracted from each 
interview, yielding an item pool of three hundred items. Fifty of the 
most frequently mentioned behaviors from this sample were selected for 
inclusion in the checklist. 

Items were assigned one of four score weights, from 1 to 4, indi- 
cating to what extent possession of a behavioral item handicaps the 
child's adjustment. Score weights were derived from a panel of behav- 
ioral scientists' ratings of the seriousness of the behavioral items 
in handicapping behavioral adjustment. Kuder-Richardson estimates of 
the reliability of the WPBIC are .98 and .89 respectively (Walker, 1970). 
A test-retest reliability estimate with a one month interval yielded 
an r of .80 (Walker and Bull, 1970). The average item validity, as 
measured by correlations of individual items with total score, was .40. 
Contrasted groups validity indicates there was a statistically signif- 
icant difference between the mean score of a group of deviant children 
and a matched group of normal children (N = 38). The biserial 
correlation between checklist scores and criterion scores, based upon three 
independent criteria of behavior disturbance, was .68. Consistent sex 
differences in checklist score were obtained across raters (teachers) 
and across grade levels. 

The design of this study provided for the identification of fac- 
torially homogeneous groupings of pupils on five dimensions of classroom 
behavior. It also provided an opportunity for replication of results 
obtained with the normative sample upon another, larger sample of pupils 
in grades one, two, and three. 1 



Results and Discussion 

Comparisons Between Identification and Normative Samples 

The WPBIC was standardized on a 534 pupil sample of children in 
grades four, five, and six. The identification sample consisted of 1067 
children in grades one, two, and three. Table 1 contains the means and 
standard deviations for the two samples. 



Insert Table 1 

The difference of 3.02 in mean score between the two samples is 
statistically significant beyond the .001 level.* The lower mean score of 
pupils in the identification sample indicates that as a group they were rated 
as less deviant by their teachers than pupils in the normative sample. 
Peterson (1961) has reported findings indicating the presence of 
nonlinear developmental changes, as measured by behavior ratings, over the 
age range kindergarten through grade six. It is possible that the 
significant difference between the two samples reflects true 
developmental differences between pupils in grades one, two, and three and 
pupils in grades four, five, and six. However, acceptance of this 
hypothesis would mean that children exhibit significantly more deviant 
behavior as they progress through school. It would appear, at present, 
that there are not enough data reported in the literature to provide 
conclusive support for this hypothesis. 
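
For readers who wish to check the size of the mean difference, a critical ratio can be computed from the means, standard deviations, and sample sizes in Table 1. The sketch below assumes an unpooled standard error; the report does not state which formula was used, and the function name `critical_ratio` is only illustrative. Under that assumption the result is about 6.05, close to but not identical with the 6.16 reported in Table 1, so the original computation may have used a different variance estimate or unrounded values.

```python
import math


def critical_ratio(mean1, sd1, n1, mean2, sd2, n2):
    """Difference between two independent means divided by its standard error.

    Uses separate (unpooled) variances, which is an assumption here.
    """
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean2 - mean1) / standard_error


# Values from Table 1: identification sample first, normative sample second.
cr = critical_ratio(4.74, 6.66, 1067, 7.76, 10.53, 534)
print(round(cr, 2))
```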

Ross, Lacey, and Parton (1965) have suggested that when age-related 
changes on teacher checklists are found, they may be a function of 
systematic differences in teachers that are correlated with the grade 
level at which they are teaching. Support for this assumption is pro- 
vided by Walker (1970) who found that teachers in grade six rated 
children in their classes as significantly less deviant than did teach- 
ers in grades four and five. 

A more plausible explanation for the consistently lower scores of 
pupils in the identification sample relates to differences in sample 
selection procedures associated with the identification study and the 
original standardization study. In the standardization study, a random 
sample of classrooms at the fourth, fifth, and sixth grade levels was 
drawn from the total number of elementary schools in the Eugene district. 
This procedure resulted in seven classrooms selected from each grade 
level. Teachers in the sample rated all pupils in their classrooms on 
the checklist. In the identification study, the school district required 
that signed permission slips be obtained from each child's parent before 
the child was rated on the checklist by his teacher. Thus all teachers in 
grades one, two, and three were included in the study. The return of 
permission slips and subsequent teacher ratings varied from zero to 
approximately seventy-five percent. Substantial feedback from teachers 
in the sample suggested that permission slips were not received from 
parents of the most deviant children in their classrooms. The fact 
that scores for these children were not included in the data analysis 
could explain the consistently lower scores of children in the iden- 
tification sample. 



Insert Table 2 

Inspection of Table 2 reveals that the mean scores for pupils in grades 
four, five, and six are higher than the means for grades one, two, and 
three. The consistency of the effect across grades suggests that the 
mean scores for each grade level in the normative sample are more 
representative of the pupils' true behavioral status since they were 
based upon scores for all children enrolled in each classroom. If 
teacher reports that the more deviant children tended to be excluded 
from the sample are true, then checklist ratings on all pupils in each 
classroom in the identification sample would have probably resulted in 
higher mean scores for each grade level. 



Intercorrelations Among the Factor Scales 



The relationships that exist between the item clusters making 
up the five factors of the WPBIC are presented in the correlation 
matrices in Table 3. 



Insert Table 3 



Table 3 contains intercorrelations among the scales for both the iden- 
tification and normative samples. The correlations indicate that with 
the exception of item clusters one and three, there is very little over- 
lap among the five factor scales in both samples. This suggests that 
the WPBIC provides measures of separate dimensions within the same 
general behavior domain, e.g., behavioral disturbance. 

The r of .67 between the acting-out and distractability factor 
scales in the normative sample and the equivalent r of .49 in the iden- 
tification sample indicates that these two dimensions share the greatest 
amount of variance of any of the five factors within the checklist. The 
content of the items in each factor supports the assumption that the two 
scales measure common elements or dimensions of behavior. In addition, 
acting out or hyperactive children often manifest very high rates of 
non-attending and distractive behavior (Walker and Buckley, 1968; 
Patterson, Jones, Wright, and Whittier, 1965). 
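
The amount of variance two scales share is conventionally taken as the square of their correlation. Applying that convention (an addition here, not a computation reported in the study) to the two coefficients cited above gives roughly 45 percent shared variance in the normative sample and 24 percent in the identification sample.

```python
def shared_variance(r):
    """Proportion of variance two scales have in common, given correlation r."""
    return r ** 2


# Acting-out versus distractability coefficients cited in the text.
normative = shared_variance(0.67)
identification = shared_variance(0.49)
print(round(normative, 2), round(identification, 2))
```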

Intercorrelations among the five factor scales show a high degree 
of correspondence in the normative and identification samples. The 
coefficients, in Table 3, between scales four and five and between 
scales two and three are identical in the two samples. The remaining 
intercorrelations for the identification sample closely parallel those 
for the normative sample in magnitude as well as in relative proportion 
to one another. Thus, relationships among the factor scales appear to 
remain stable across different samples of children. 
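
One simple way to summarize this correspondence is the average absolute difference between the two sets of coefficients in Table 3. The summary statistic is an illustrative choice made here, not one used in the study, and the dictionary keys are shorthand labels for the scale pairs.

```python
# Table 3 coefficients as (normative, identification) pairs.
TABLE_3 = {
    ("acting_out", "social_withdrawal"): (0.02, 0.09),
    ("acting_out", "distractability"): (0.67, 0.49),
    ("acting_out", "disturbed_peer_relations"): (0.48, 0.37),
    ("acting_out", "immaturity"): (0.39, 0.28),
    ("social_withdrawal", "distractability"): (0.12, 0.12),
    ("social_withdrawal", "disturbed_peer_relations"): (0.18, 0.28),
    ("social_withdrawal", "immaturity"): (0.23, 0.32),
    ("distractability", "disturbed_peer_relations"): (0.43, 0.31),
    ("distractability", "immaturity"): (0.44, 0.28),
    ("disturbed_peer_relations", "immaturity"): (0.34, 0.34),
}

# Absolute difference between samples for each pair of scales.
differences = {pair: abs(n - i) for pair, (n, i) in TABLE_3.items()}
mean_abs_diff = sum(differences.values()) / len(differences)
print(round(mean_abs_diff, 3))
```

On these figures the ten coefficients differ between samples by a little under .10 on average.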

Effects of Sex of Pupil and Grade Placement Upon Factor Scale Scores 
and Total Checklist Score 

An analysis of variance for a 2 x 3 factorial design (Winer, 1962) 
was used to analyze the effects of grade and sex upon checklist score. 
Analyses of variance were computed for total score on the checklist and 
for each of the five factor scales. Levels of each factor were male 
versus female and grade levels one, two, and three. 

Insert Table 4 



The F ratio of 29.61 in Table 4 indicates there was a statistically 
significant main effect for sex of pupil. The mean score for males 
across grade level was 5.97. The mean score for females was 3.63. 
Separate t tests indicated the sex difference was statistically signif- 
icant within each of the three grade levels. There was no statistically 
significant effect for grade level. The interaction between grade level 
and sex of pupil was also not significant. The respective F ratios in 
Table 4 are 1.46 and .20. The significant sex difference in checklist 
score replicates an identical result obtained in the normative sample. 
This finding is also consistent with studies reported in the literature 
which indicate consistently more deviant scores for males than females 
in ratings of child behavior (Quay, 1970). The finding of no 
statistically significant differences in checklist score between grade levels 
is consistent with results reported by Ross, Lacey, and Parton (1965). 
However, Peterson (1961) has reported the presence of nonlinear develop- 
mental changes, as measured by behavior ratings, over the age levels 
kindergarten through grade six. Ross, Lacey, and Parton point out that 
their results were based upon 31 teachers at each grade level while 
Peterson's results were obtained from an average of seven teachers at 
each grade level. Similarly, mean scores in this study were based upon 
an average of 47 teachers at each grade level. The issue of whether 
true developmental differences exist across grade levels is not clear 
at the present time. Ross, Lacey, and Parton argue that this issue 
must be resolved before behavior checklist data can be considered an 
unambiguous means of assessing developmental changes in the behavior 
of children. 
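
The F ratios for grade level and for the grade by sex interaction can be recovered from the sums of squares and degrees of freedom in Table 4, since each F is the mean square for the effect divided by the mean square for error. The short sketch below does this; the function name `f_ratio` is illustrative. The sex effect is omitted because its sum of squares is not fully legible in the available copy of Table 4.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F ratio: mean square for the effect over mean square for error."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error


# Sums of squares and degrees of freedom from Table 4.
SS_ERROR, DF_ERROR = 45852, 1061
f_grade = f_ratio(126, 2, SS_ERROR, DF_ERROR)
f_interaction = f_ratio(17, 2, SS_ERROR, DF_ERROR)
print(round(f_grade, 2), round(f_interaction, 2))
```

Both values agree with the 1.46 and .20 given in the text.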

Analyses of variance for each of the factor scales are presented 
in Tables 5 through 9. 



Insert Tables 5 through 9 



Inspection of the tables reveals that the statistically significant sex 
difference obtained for total checklist score held true for three of the 
five factors. Significant F ratios for sex of pupil were obtained for 
the acting out, distractability, and disturbed peer relations scales. 
The respective F ratios were 23.02, 67.19, and 4.55. The 
most powerful sex difference was associated with the acting out and 
distractability factors. Both these factors measure behaviors that 
directly compete with academic performance. Thus, it appears these two 
clusters of behaviors clearly discriminate between males and females in 
the first three grades. There is considerable support for this hypothe- 
sis in the literature. Data from behavior rating scales, behavior 
checklists, and observation schedules indicate that these two response 
classes are powerful discriminators between male and female pupils. In 
addition, Buckley, Walker, Bridges, and Hendy (1970) operated a token 
economy classroom, over a four year period, for disturbed children. 

The most common reasons for referral were high rates of acting-out and/ 
or distractable behavior. Of 65 children referred during this period, 
59 were males and 6 were females. 

Males were also rated as significantly more deviant than females 
on the cluster of items measuring disturbed peer relations. This 
factor provides a measure of the child's social relationship(s) with 
his peers. The behaviors making up the disturbed peer relations scale 
do not compete as directly with academic performance as do those 
comprising the acting out and distractability scales. However, possession 
of all or a substantial majority of the behaviors in the scale would 
severely handicap a child's educational as well as behavioral 
adjustment. If a true sex difference does exist on this cluster of behaviors, 
then it appears teachers are able to make valid discriminations on 
behaviors that directly compete with academic performance as well as 
those that handicap a child's educational adjustment in a less direct 
and more general way. 

There was no statistically significant sex difference for the social 
withdrawal and immaturity factor scales. The F ratios for the main 
effect of sex were .15 for social withdrawal and .06 for immaturity. 
These two factors measure behavior dimensions associated with restricted 
functioning, avoidance behavior, low rates of peer interactions, and 
peer relationships that would be classified as deficient or maladaptive 
instead of coercive or deviant. Thus, it would appear that males and 
females in grades one, two, and three share an approximately equal pro- 
bability of being rated high on these two factor scales. 

There was no significant main effect for grade level within any of 
the factor scales. In addition, there were no significant interaction 
effects for sex and grade within the five scales. Thus, the significant 
effect for sex of pupil and the absence of a significant effect for 
either grade level or interaction proved highly reliable across the 
factor scales in this study. 




Behavioral Incidence Data 

The percentage of pupils receiving scores of one standard deviation 
above the mean (on each of the five scales) was analyzed for the iden- 
tification sample. Separate analyses were conducted for each of the 
factors. 



Insert Table 10 



Table 10 contains the percentages of male, female, and total subjects 
scoring at or above one standard deviation for each factor scale using 
the means and sigmas for the original normative sample. An average of 
5.90 percent of subjects scored at or above one sigma across the five 
scales. A z test was used to test the statistical significance of the 
percentage difference between males and females scoring at or above one 
sigma within each factor scale. Table 10 reveals that the percentage 
differences for the acting-out, distractability, and disturbed peer 
relations scales were statistically significant. The social withdrawal 
and immaturity scale differences did not approach the levels required 
for significance. These data are consistent with the results of the 
analyses of variance of factor scores discussed earlier. 
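
A z test for the difference between two independent percentages is commonly computed with a pooled proportion, as sketched below. The report does not give the formula it used, so this is an assumed form, and the counts in the example are hypothetical rather than taken from Table 10.

```python
import math


def two_proportion_z(hits1, n1, hits2, n2):
    """z statistic for the difference between two independent proportions."""
    p1, p2 = hits1 / n1, hits2 / n2
    # Pooled proportion under the hypothesis of no difference.
    pooled = (hits1 + hits2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Hypothetical counts: 60 of 500 males and 30 of 500 females
# scoring at or above one sigma on a factor scale.
z = two_proportion_z(60, 500, 30, 500)
print(round(z, 2))
```

With these made-up counts the statistic exceeds 2.58, the two-tailed critical value at the .01 level.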



Insert Table 11 

Table 11 contains the percentages of male, female, and total subjects 
scoring at or above one standard deviation for each factor scale using 
the means and sigmas for the identification sample. The percentages 
for male, female, and total subjects are larger due to the lower mean 
score(s) and smaller standard deviation(s) of the identification sample. 
However, the results in Table 11 replicate those in Table 10. An aver- 
age of 11.34 percent of the total subjects scored at or above one 
sigma across the five factors in the identification sample. This 
compares with 5.90 percent using the normative sample means and sigmas. 
The statistically significant percentage differences for males and 
females are also identical in Table 11. 
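
The reason the percentages rise when identification sample norms are used can be seen by comparing the two cutoffs. Because the factor scale means and sigmas are not reproduced in this section, the sketch below uses the total score figures from Table 1 as a stand-in, and the list of scores is hypothetical.

```python
def percent_at_or_above(scores, cutoff):
    """Percentage of scores falling at or above a cutoff."""
    return 100 * sum(score >= cutoff for score in scores) / len(scores)


# One standard deviation above the mean, total score figures from Table 1.
normative_cutoff = 7.76 + 10.53
identification_cutoff = 4.74 + 6.66

# Hypothetical total scores, for illustration only.
scores = [0, 2, 3, 5, 8, 12, 15, 19, 22, 30]
print(percent_at_or_above(scores, normative_cutoff))
print(percent_at_or_above(scores, identification_cutoff))
```

The lower cutoff produced by the identification sample norms admits more pupils, which is the pattern seen when Table 11 is compared with Table 10.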

Summary and Conclusions 

Results of this and other studies suggest that behavior checklist 
data provided by teacher ratings of child behavior provide a valuable 
and relatively inexpensive method of identifying homogeneous groupings 
of classroom behavior. However, the practice of relying upon a single 
teacher rating to establish homogeneity and behavior class membership 
has not been clearly validated. It would appear a more intensive 
screening process using repeated observations of actual classroom be- 
havior (with reliable observers) would be necessary to reliably deter- 
mine homogeneity. Further, the use of checklist data to measure 
developmental changes and to evaluate the effects of intervention does 
not appear to be justified by research data presented in the literature. 

Results of this study indicate that teacher ratings of various 
classes of behavior reflect sex differences that have been validated 
in other studies. The data appear to be internally consistent and 
replicate many of the results obtained with the normative sample. 

Footnotes 

1. A test applied to the variances of the identification sample and the 
normative sample indicated the assumption of homogeneity of variance 
underlying the t test could not be met. Boneau (1960) argues that if 
two samples have unequal sizes and unequal variances and their respective 
distributions are skewed (as in this case), the resulting t ratios will 
also tend to be skewed and will lead to biased results. However, using 
samples of larger size tends to remove this skew (Cownie and Heath, 
1970). Further, Edwards (1967) notes that if the t test is applied to 
independent random samples of size 25 or more, the t test is relatively 
unaffected by rather severe violations of the assumptions of homogeneity 
of variance and normality of the distributions in the population. Con- 
sidering the robustness of the t test and the size of the two independent 
samples in this study, it was decided to use the t test to determine 
statistical significance of the mean difference. 

Table 1 

Means and Standard Deviations for the 
Identification and Normative Samples 

  Identification         Normative 
  Sample (N=1067)        Sample (N=534) 

    X       S.D.           X       S.D.        D.      Critical Ratio 

   4.74     6.66          7.76    10.53      (3.02)       6.16*** 

*Significant at .05 
**Significant at .01 
***Significant at .001 

Table 2 

Mean Scores for Pupils in the 
Identification and Normative Samples 

  Identification             Normative 
  Sample (N=1067)            Sample (N=534) 

  Grade        X             Grade        X 

    1         4.97             4         9.48 
    2         5.00             5         8.72 
    3         4.20             6         5.04 

Table 3 

Intercorrelations of the Five WPBIC 
Factor Scales in the Identification 
Sample and the Normative Sample* 

                           Acting-    Social       Distracta-   Disturbed Peer 
                           Out        Withdrawal   bility       Relations        Immaturity 

Acting-Out                   --       .02 (.09)    .67 (.49)    .48 (.37)        .39 (.28) 
Social Withdrawal                        --        .12 (.12)    .18 (.28)        .23 (.32) 
Distractability                                       --        .43 (.31)        .44 (.28) 
Disturbed Peer Relations                                           --            .34 (.34) 
Immaturity                                                                          -- 

*Intercorrelations within parentheses are for the identification sample. 
Unenclosed coefficients are for the normative sample. 

Table 4 

Analysis of Variance for a 2 x 3 
Factorial Design: Total Checklist Score 

Source                  SS         DF        MS         F 

Total                 47,274      1,066      --         -- 
Sex of pupil (A)       1,27S          1     1,270     29.61*** 
Grade level (B)          126          2        63      1.46 
A x B                     17          2         8       .20 
Error                 45,852      1,061        43 

*Significant at .05 
**Significant at .01 
***Significant at .001 